Phase 1b: MSE-optimal bandwidth selector (CCF 2018 DPI) #335
Port nprobust 0.5.0 (SHA 36e4e53) lpbwselect(bwselect="mse-dpi") in-house as diff_diff.mse_optimal_bandwidth + BandwidthResult, backed by the private diff_diff._nprobust_port module (kernel_W, lprobust_bw, lpbwselect_mse_dpi). Three-stage DPI with four lprobust.bw calls at orders q+1, q+2, q, p.
Parity verified at 0.0000% on all five stage bandwidths (c_bw, bw_mp2, bw_mp3, b_mse, h_mse) across three deterministic DGPs (uniform, Beta(2,2), half-normal) via benchmarks/R/generate_nprobust_golden.R.
weights= raises NotImplementedError (no nprobust parity anchor); deferred to Phase 2. vce='nn' is the only verified variant; hc0/hc1/hc2/hc3 paths are implemented but untested.
144 tests pass (32 new bandwidth-selector/port + 55 existing local_linear regression + others).
Plan at ~/.claude/plans/vectorized-beaming-feather.md was approved with a 1% parity target; the actual port achieves essentially exact parity. Phase 1b checkbox ticked in docs/methodology/REGISTRY.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P1 fixes from local AI review:
- Add bwcheck validation in lpbwselect_mse_dpi: reject bwcheck < 1 and bwcheck > N with a clear ValueError instead of an IndexError inside the sorted-abs indexing.
- Export BandwidthResult and mse_optimal_bandwidth from diff_diff/__init__.py::__all__ so the public API propagates through star imports and introspection.
- Fix contradictory docstring in lpbwselect_mse_dpi: the HAD case (p=1, deriv=0) has even=False, so the R source dispatches to the closed-form C$bw branch, NOT to optimize(). Docstring now matches the actual control flow.
- 4 new tests in test_bandwidth_selector covering bwcheck > N, bwcheck=0, bwcheck=-1, and bwcheck=None on a 5-obs sample.
148 tests pass (up from 144).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
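The bwcheck guard described above might look roughly like the sketch below. It is an illustration only, not the port's code: validate_bwcheck and bwcheck_floor are hypothetical names, and the floor semantics (distance from the evaluation point to its bwcheck-th nearest observation) are assumed from the "sorted-abs indexing" wording.

```python
import numpy as np

def validate_bwcheck(bwcheck, n):
    """Reject an out-of-range bwcheck before it is used as an index (hypothetical helper)."""
    if bwcheck is None:
        return None
    if bwcheck < 1 or bwcheck > n:
        raise ValueError(
            f"bwcheck must satisfy 1 <= bwcheck <= N (N={n}); got {bwcheck}."
        )
    return int(bwcheck)

def bwcheck_floor(x, eval_point, bwcheck):
    """Distance from eval_point to its bwcheck-th nearest observation (assumed semantics)."""
    x = np.asarray(x, dtype=float).ravel()
    k = validate_bwcheck(bwcheck, x.size)
    if k is None:
        return 0.0  # no floor requested
    # Without the guard, k > N raises IndexError here and k = 0 silently wraps to index -1.
    return float(np.sort(np.abs(x - eval_point))[k - 1])
```

Validating up front matters most for bwcheck=0: NumPy would otherwise read index -1 and return the largest distance without any error.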
AI review flagged that the public wrapper silently hard-codes vce="nn", nnmatch=3, p=1, deriv=0, interior=False while the underlying port supports a broader surface. Two options: expose the extra parameters or document the restriction. Chose to document, since the Phase 1b plan explicitly scopes those paths as untested / deferred.
- mse_optimal_bandwidth docstring now has a "Public API scope" section describing the HAD Phase 1b restriction and pointing users who need the broader surface at diff_diff._nprobust_port.lpbwselect_mse_dpi.
- REGISTRY.md Phase 1b entry gains a "Note (public API scope restriction)" documenting the same contract.
- New test test_public_wrapper_fixes_vce_nn_nnmatch_3 pins the restriction: the wrapper's output must equal the explicit port call with vce="nn", nnmatch=3, p=1, deriv=0, interior=False. Any future scope expansion now fails the test and forces a REGISTRY/docstring update in lockstep.
150 tests pass (up from 148).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Overall Assessment Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
P1 #1 (methodology): mse_optimal_bandwidth now rejects boundary > d.min() with a clear ValueError. The Phase 1b wrapper is scoped to the HAD lower-boundary case (Design 1' with d_0 = 0 or Design 1 continuous-near-d_lower with d_0 = min D_2). Interior or upper-boundary inputs would silently run the boundary selector with a symmetric kernel and return a bandwidth incompatible with the one-sided fitter. The port remains available for interior / broader surface via _nprobust_port.lpbwselect_mse_dpi.
P1 #2 (code quality): lprobust_bw validates in-window observation counts at each of the three local-poly fits before calling qrXXinv:
- variance: n_V >= o+1
- B1: n_B1 >= o_B+1
- B2: n_B2 >= o_B+2
Each guard raises a targeted ValueError naming the failing stage, the bandwidth, and suggested remediation. Previously these failed with an opaque LinAlgError from Cholesky on under-determined designs.
P3 (doc): local_linear.py module docstring updated to say Phase 1b "ships" instead of "will add"; tiny-sample test now asserts the new ValueError contract instead of accepting any non-IndexError failure.
New behavioral tests:
- test_interior_boundary_rejected: boundary=0.5 on U(0,1) rejected
- test_upper_boundary_rejected: boundary=d.max() rejected
- test_boundary_equal_to_min_d_accepted: boundary=min(d) accepted (Design 1 continuous-near-d_lower path)
- test_boundary_below_min_d_accepted: boundary=0 with d.min()>0 accepted (Design 1' path)
- test_bwcheck_none_on_tiny_sample_raises_valueerror: upgraded from "catch anything non-IndexError" to pytest.raises(ValueError, match="lprobust_bw").
153 tests pass (up from 149).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
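A per-stage in-window count guard of the kind described in P1 #2 could be sketched as follows. check_window_count is a hypothetical helper, and the closed window |x - c| <= h is an assumption; the port's kernel support convention may differ.

```python
import numpy as np

def check_window_count(x, eval_point, h, min_obs, stage):
    """Raise a targeted ValueError if too few observations fall within h of eval_point."""
    x = np.asarray(x, dtype=float).ravel()
    # Assumed window definition: observations with |x - c| <= h carry kernel weight.
    n_in = int(np.count_nonzero(np.abs(x - eval_point) <= h))
    if n_in < min_obs:
        raise ValueError(
            f"lprobust_bw ({stage} stage): {n_in} observations within h={h:g} of "
            f"the evaluation point, need at least {min_obs}. "
            "Use more data or a larger pilot bandwidth."
        )
    return n_in

# With polynomial order o = 1 the variance stage needs n_V >= o + 1 = 2.
n_v = check_window_count([0.0, 0.1, 0.2, 0.9], 0.0, 0.25, min_obs=2, stage="variance")
```

Naming the stage in the message is what lets a test such as pytest.raises(ValueError, match="lprobust_bw") pin the contract.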
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment Executive Summary
Methodology Affected methods: Calonico-Cattaneo-Farrell (2018) MSE-DPI bandwidth selection for local-polynomial boundary estimation, as used in the HAD nonparametric Equation 7/8 path.
Code Quality
Performance
No findings.
Maintainability
No findings.
Tech Debt
No findings. The two P1 issues above are not tracked in TODO.md, so they remain unmitigated blockers rather than accepted deferred work.
Security
No findings.
Documentation/Tests
Path to Approval
P1 #1 (methodology): mse_optimal_bandwidth now rejects Design 1 mass-point designs. When boundary > 0 and the modal fraction at d.min() exceeds the REGISTRY-specified 2% threshold, raise NotImplementedError pointing to the 2SLS sample-average estimator per de Chaisemartin et al. (2026) Section 3.2.4. Design 1' with untreated units at d=0 (boundary=0) is still accepted per Garrett et al. (2020) application precedent.
P1 #2 (code quality): qrXXinv now catches np.linalg.LinAlgError from Cholesky and re-raises as ValueError with a targeted message naming the failing dimension and suggesting remediation. Duplicate-support windows or other rank-deficient designs now fail with a clear error instead of leaking LinAlgError out of the port.
P3 (tests): Added TestStageDiagnosticsParity::test_R_parity covering all four stages. Previously only V/B1/B2 were pinned; R (BWreg) was only trivially checked for stage_d1 (scale=0 -> R=0). Now stage_b and stage_h R values are explicitly parity-tested at 1% against R nprobust.
New behavioral tests:
- test_mass_point_design_rejected: 10% mass at 0.1 -> NotImplementedError
- test_continuous_near_d_lower_accepted: uniform(0.1, 1.0) passes
- test_untreated_at_zero_accepted: 15% at d=0 with boundary=0 passes
- test_rank_deficient_design_raises_valueerror: rank-1 X -> ValueError
- R parity on all four stages across 3 DGPs (12 new parametrized cases)
169 tests pass (up from 153).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
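The mass-point rule in P1 #1 can be illustrated with a short sketch. The function names are hypothetical, and measuring the modal fraction by exact equality with d.min() is an assumption; only the 2% threshold and the boundary > 0 gate come from the commit message.

```python
import numpy as np

MASS_POINT_THRESHOLD = 0.02  # 2% threshold quoted from REGISTRY in the commit message

def modal_fraction_at_min(d):
    """Share of observations sitting exactly at the sample minimum (assumed definition)."""
    d = np.asarray(d, dtype=float).ravel()
    return float(np.mean(d == d.min()))

def reject_mass_point(d, boundary):
    """Raise NotImplementedError for a Design 1 mass-point sample, as gated in this commit."""
    frac = modal_fraction_at_min(d)
    if boundary > 0 and frac > MASS_POINT_THRESHOLD:
        raise NotImplementedError(
            f"{frac:.1%} of observations sit at d.min(); mass-point designs "
            "need the 2SLS sample-average estimator, not the nonparametric path."
        )
    return frac

# 15% untreated units at d=0 with boundary=0 is still accepted (Design 1').
untreated = np.concatenate([np.zeros(15), np.linspace(0.1, 1.0, 85)])
frac_untreated = reject_mass_point(untreated, boundary=0.0)
```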
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment
Executive Summary
Methodology
Code Quality
No findings. The earlier raw
Performance
No findings.
Maintainability
No findings.
Tech Debt
No findings. The blocking methodology issue above is not tracked in TODO.md:L51, so it remains unmitigated.
Security
No findings.
Documentation/Tests
Execution note: I could not run
Path to Approval
… test
P1 (methodology): The boundary=0 default path now classifies inputs
per the REGISTRY-planned design="auto" rule. Previously the
mass-point guard only fired for boundary>0, so a dataset with
d.min()>0 and mass at d.min() would silently pass through the
default-boundary path as Design 1'. The new rule:
- boundary=0 with min(d) < 0.01 * median(d): Design 1' accepted
  (support infimum effectively at 0).
- boundary=0 with min(d) >= 0.01 * median(d):
  * modal fraction at min(d) > 2%: mass-point design -> raise
    NotImplementedError pointing to the 2SLS / Phase 2 path.
  * otherwise: ambiguous -> raise ValueError asking the caller
    to pass boundary=d.min() for the Design 1
    continuous-near-d_lower path.
- boundary>0 path unchanged (mass-point check already in place).
Removed the stale test_boundary_below_min_d_accepted: it used
U(0.01, 1) data which doesn't satisfy Design 1' under the stricter
rule. Replaced with three targeted tests:
- test_boundary_zero_design_1_prime_accepted: U(0, 1) passes.
- test_boundary_zero_with_positive_d_min_rejected: U(0.5, 1)
  raises "Ambiguous design".
- test_boundary_zero_with_d_min_mass_point_rejected: mass at 0.1
  with boundary=0 raises mass-point NotImplementedError.
P3 (tests): Added test_full_stack_rank_deficient_raises_valueerror
driving a 3-distinct-value dataset through both lpbwselect_mse_dpi
and the public wrapper; both must raise ValueError (or
NotImplementedError), never LinAlgError or IndexError.
172 tests pass (up from 169).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
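The "ValueError, never LinAlgError" contract exercised by the rank-deficient test can be sketched as below. chol_inverse is a hypothetical stand-in for the port's qrXXinv; the error wording is an assumption.

```python
import numpy as np

def chol_inverse(xtx):
    """Invert a symmetric positive-definite matrix via Cholesky, translating failures."""
    xtx = np.asarray(xtx, dtype=float)
    try:
        lower = np.linalg.cholesky(xtx)  # xtx = lower @ lower.T
    except np.linalg.LinAlgError as exc:
        k = xtx.shape[0]
        raise ValueError(
            f"qrXXinv: {k}x{k} cross-product matrix is not positive definite; "
            "the window likely has too few distinct support points. "
            "Try a larger bandwidth or check for duplicated values."
        ) from exc
    lower_inv = np.linalg.inv(lower)
    return lower_inv.T @ lower_inv  # (L L')^{-1} = L^{-T} L^{-1}

# Every observation at the evaluation point: the slope column of the
# local-linear design [1, x - c] is identically zero.
X = np.column_stack([np.ones(4), np.zeros(4)])
```

Cholesky only fails when a pivot is non-positive in floating point, so a nearly singular design can still pass; a guard like this catches the exact-degeneracy cases rather than ill-conditioning in general.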
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
Reviewer correctly flagged that the 1%-of-median rule is a Phase 2 design="auto" heuristic, not Phase 1b. Backed off that over-reach.
P1 #1: Removed the min(d)/median(d) < 0.01 check. The mass-point guard now applies uniformly (whenever d.min() > 0 and modal fraction at d.min() > 2%) and does not gate on boundary. This still catches the original concern (silently routing mass-point data through the nonparametric branch) without rejecting valid Design 1' samples like Beta(2,2) where d.min() is strictly positive but small.
P1 #2: Tightened boundary validation. The wrapper now accepts only boundary ~ 0 (Design 1') or boundary ~ d.min() (Design 1 continuous-near-d_lower) within float tolerance. Off-support values, including the previously-allowed "boundary < d.min()" path, are rejected with a targeted error message.
P3: Added a public-wrapper duplicate-support regression that drives a rank-deficient X'X through the full selector stack (boundary = d.min(), unique minimum, only 4 distinct d values) and asserts a specific "qrXXinv" ValueError, not LinAlgError.
Test updates:
- Removed test_boundary_zero_with_positive_d_min_rejected: the case it modeled is now accepted (no mass point).
- Added test_boundary_zero_thin_boundary_density_accepted: Beta(2,2) Design 1' with vanishing boundary density now passes.
- Added test_off_support_boundary_rejected: boundary=0.5 on U(1,2).
- Added test_negative_boundary_rejected: boundary<0 rejected.
- Updated test_nonzero_boundary: uses boundary=float(d.min()), not boundary=1.0 (which is off the realized support of U(1,2)).
175 tests pass (up from 172).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
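The tightened boundary rule in P1 #2 amounts to a two-way tolerance check. A minimal sketch, assuming illustrative tolerances and a hypothetical classify_boundary helper (neither is taken from the wrapper):

```python
import numpy as np

def classify_boundary(d, boundary, atol=1e-12, rtol=1e-9):
    """Accept only boundary ~ 0 or boundary ~ d.min(); reject everything else."""
    d = np.asarray(d, dtype=float).ravel()
    d_min = float(d.min())
    if np.isclose(boundary, 0.0, rtol=0.0, atol=atol):
        return "design_1_prime"
    if np.isclose(boundary, d_min, rtol=rtol, atol=atol):
        return "design_1_continuous_near_d_lower"
    raise ValueError(
        f"boundary={boundary!r} is neither 0 nor the sample minimum "
        f"d.min()={d_min!r}; pass boundary=0 or boundary=float(d.min())."
    )

# Deterministic stand-in for a U(1, 2) sample whose realized minimum is 1.05.
d = np.linspace(1.05, 2.0, 20)
label_at_min = classify_boundary(d, float(d.min()))
label_at_zero = classify_boundary(d, 0.0)
```

Under this rule boundary=1.0 is rejected for the sample above, which is why test_nonzero_boundary had to switch to float(d.min()).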
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment ✅ Looks good — no unmitigated P0/P1 findings in the current diff. Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
CI AI review P3: the docstring only listed shape/finite/kernel/weights raises but the implementation now also raises on boundary off-support, mass-point, bwcheck out-of-range, and per-stage rank/count failures. Document the full contract. 175 tests pass (unchanged). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment ✅ Looks good — no unmitigated P0/P1 findings in the current diff. Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
CI AI review P3: the boundary parameter docs said callers could pass d_lower (the theoretical support lower bound), but the implementation requires boundary = float(d.min()) (the sample minimum). Users following the old docs on e.g. U(1, 2) data with boundary=1.0 would hit an avoidable ValueError. Docstring now explicitly says to use the sample minimum.
Deferred (P3): extending R golden parity to triangular and uniform kernels. All three kernels go through the same lprobust_bw code path, so epa parity transitively covers dispatch; tri/uni parity is a nice-to-have regression anchor but not a correctness gap.
175 tests pass (unchanged).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment
Executive Summary
Methodology Affected methods:
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
CI AI review P1: mse_optimal_bandwidth did not enforce the HAD
support requirement D_{g,2} >= 0. Negative-dose samples could
silently pass through both boundary branches: boundary=0 (accepted
via _at_zero even with d_min<0) and boundary=float(d.min()) (accepts
any lower edge). The symmetric nprobust kernel would happily
calibrate a two-sided interior bandwidth while the downstream
one-sided fitter runs on [boundary, boundary+h] -- silent
assumption violation.
Front-door check added: np.any(d < 0) raises ValueError with a
message citing the paper's support assumption.
Two new regression tests:
- test_negative_dose_rejected_boundary_zero: d ~ U(-0.5, 0.5) with
boundary=0 raises.
- test_negative_dose_rejected_boundary_at_d_min: d ~ U(-1, -0.1)
with boundary=d.min() raises.
Deferred (P3 same as last round): tri/uni/shifted-boundary golden
parity extension. All three kernels share lprobust_bw, so epa parity
transitively covers kernel dispatch.
177 tests pass (up from 175).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
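The front-door support check from this commit is small enough to sketch in full. require_nonnegative_dose is a hypothetical name and the message text is an assumption; the np.any(d < 0) condition is the one quoted above.

```python
import numpy as np

def require_nonnegative_dose(d):
    """Enforce the HAD support requirement D_{g,2} >= 0 before any bandwidth work."""
    d = np.asarray(d, dtype=float).ravel()
    if np.any(d < 0):
        n_neg = int(np.count_nonzero(d < 0))
        raise ValueError(
            f"{n_neg} negative dose value(s) found; the HAD design assumes "
            "D_{g,2} >= 0, so the one-sided fit on [boundary, boundary + h] "
            "would not match the selected bandwidth."
        )
    return d

checked = require_nonnegative_dose([0.0, 0.2, 0.7])
```

Running this before the boundary branches means both boundary=0 and boundary=float(d.min()) reject negative-dose samples for the same reason.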
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
P1 #1: boundary=0 now enforces a Design 1' support plausibility heuristic: d.min() <= 5% * median(|d|). Samples with d.min() substantially positive (e.g. U(0.5, 1)) are rejected with ValueError directing the caller to boundary=float(d.min()). Threshold chosen at 5% (not REGISTRY's 1%) so the paper's thin-boundary-density DGPs (Beta(2,2), d.min/median ~ 3%) still pass. Reordered so the mass-point check (NotImplementedError, paper Section 3.2.4) fires before the support check: mass-point data should be redirected to 2SLS regardless of the boundary the caller picked.
P1 #2: Empty-input front-door guard. d.size == 0 raises ValueError with a targeted "must be non-empty" message instead of leaking the NumPy reduction error from d.min().
P3 (docstring sync): _nprobust_port module docstring no longer says weighted data can be handled by the public wrapper; the wrapper explicitly raises NotImplementedError. Docstring now matches the actual contract.
P3 (deferred, same as last round): tri/uni/shifted-boundary golden parity extension.
REGISTRY.md Phase 1b note expanded to document the full input contract (nonnegativity, boundary applicability, Design 1' support heuristic, mass-point redirection) so the public API surface is fully specified in the methodology registry.
178 tests pass (up from 177).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
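The Design 1' plausibility heuristic and the empty-input guard could be combined roughly as in the sketch below. check_design_1_prime_support is a hypothetical helper covering only the boundary=0 support check; the mass-point check that the commit orders first is omitted here.

```python
import numpy as np

SUPPORT_RATIO = 0.05  # 5% of median(|d|), the threshold stated in the commit message

def check_design_1_prime_support(d):
    """Validate that boundary=0 is plausible: d.min() <= 5% * median(|d|)."""
    d = np.asarray(d, dtype=float).ravel()
    if d.size == 0:
        # Guard first, so d.min() below never sees an empty array.
        raise ValueError("d must be non-empty.")
    d_min = float(d.min())
    cutoff = SUPPORT_RATIO * float(np.median(np.abs(d)))
    if d_min > cutoff:
        raise ValueError(
            f"boundary=0 is implausible: d.min()={d_min:g} exceeds "
            f"5% of median(|d|) ({cutoff:g}). Pass boundary=float(d.min()) instead."
        )
    return d_min, cutoff

# Thin-boundary sample: minimum is small but strictly positive and still passes.
d_min_ok, cutoff_ok = check_design_1_prime_support(np.linspace(0.02, 1.0, 50))
```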
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment
Executive Summary
Methodology
Code Quality
Performance
No findings.
Maintainability
No findings.
Tech Debt
No findings.
Security
No findings.
Documentation/Tests
Static review only:
Path to Approval
CI AI review P1: the port's lpbwselect_mse_dpi is advertised as the advanced-use entry point for callers outside the HAD Phase 1b wrapper surface. It accepted x, y, cluster without validating shapes, and the default vce="nn" branch reindexed y/cluster by argsort(x), so a longer y would be silently truncated to match x's length, producing a bandwidth on misaligned data with no warning.
Added front-door validation at the top of lpbwselect_mse_dpi:
- x and y ravel()ed and required to have the same shape
- x must be non-empty
- x, y, and eval_point must be finite
- cluster (if supplied) must match x.shape
Seven new tests in TestLpbwselectMseDpiValidation:
- test_mismatched_shapes_raise
- test_longer_y_silent_truncation_rejected (regression for the specific nn-reindex bug)
- test_cluster_wrong_length_rejected
- test_empty_direct_port_input_rejected
- test_non_finite_x_rejected
- test_non_finite_y_rejected
- test_non_finite_eval_point_rejected
185 tests pass (up from 178).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
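The silent-truncation failure mode and the front-door checks listed above can be reproduced in a few lines. validate_port_inputs is a hypothetical helper written for illustration; the order of checks and the messages are assumptions.

```python
import numpy as np

def validate_port_inputs(x, y, eval_point, cluster=None):
    """Shape, emptiness, and finiteness checks run before any sorting or indexing."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    if x.shape != y.shape:
        raise ValueError(f"x and y must have the same shape; got {x.shape} and {y.shape}.")
    if x.size == 0:
        raise ValueError("x must be non-empty.")
    if not (np.all(np.isfinite(x)) and np.all(np.isfinite(y))):
        raise ValueError("x and y must be finite (no NaN or inf).")
    if not np.isfinite(eval_point):
        raise ValueError("eval_point must be finite.")
    if cluster is not None:
        cluster = np.asarray(cluster).ravel()
        if cluster.shape != x.shape:
            raise ValueError(f"cluster must match x.shape {x.shape}; got {cluster.shape}.")
    return x, y, cluster

# The bug being guarded against: indexing a longer y with argsort(x)
# silently drops the extra entry instead of raising.
x_demo = np.array([0.3, 0.1, 0.2])
y_long = np.array([10.0, 20.0, 30.0, 40.0])
y_reindexed = y_long[np.argsort(x_demo)]  # length 3; the 40.0 is never used
```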
/ai-review
Prior commit moved shape / emptiness / finiteness checks into the port but left bwcheck validation above them. As a result, empty or non-finite inputs got "bwcheck exceeds sample size" errors instead of the targeted contract messages the tests expect. Reorder so input-shape validation runs first, then bwcheck, then kernel/vce. Drop duplicate N assignment. 185 tests pass (unchanged). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment
Executive Summary
Methodology
Code Quality
No findings.
Performance
No findings.
Maintainability
No findings.
Tech Debt
No findings.
Security
No findings.
Documentation/Tests
Path to Approval
CI AI review P1: the advanced-use lpbwselect_mse_dpi accepted cluster arrays with NaN / None silently. Downstream, clustered meat construction groups rows via cluster == c; NaN compares unequal to itself, so missing-cluster rows would drop from every block and the meat matrix would change silently, altering the selected bandwidth.
nprobust::lpbwselect does complete-case filtering on (x, y, cluster) before dispatch. This port deliberately chooses the stricter "reject" policy consistent with the library's no-silent-failures convention: missing cluster IDs raise ValueError up front. The "reject vs. filter" deviation from nprobust is documented in the module docstring Deviations list.
Handles both NaN in float arrays and None in object-dtype arrays.
Two new regression tests:
- test_missing_cluster_id_rejected_nan
- test_missing_cluster_id_rejected_none
187 tests pass (up from 185).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
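Why a NaN cluster ID vanishes from every block, and what a "reject" policy might look like, is shown in the sketch below. reject_missing_cluster_ids is a hypothetical helper; the dtype handling is one plausible way to cover both the float-NaN and object-None cases mentioned above.

```python
import numpy as np

def reject_missing_cluster_ids(cluster):
    """Raise ValueError if any cluster ID is NaN (float arrays) or None/NaN (object arrays)."""
    cluster = np.asarray(cluster).ravel()
    if cluster.dtype == object:
        # c != c is True only for NaN-like values.
        missing = np.array(
            [c is None or (isinstance(c, float) and c != c) for c in cluster],
            dtype=bool,
        )
    elif np.issubdtype(cluster.dtype, np.floating):
        missing = np.isnan(cluster)
    else:
        missing = np.zeros(cluster.shape, dtype=bool)  # ints/strings cannot be missing
    if missing.any():
        raise ValueError(
            f"{int(missing.sum())} missing cluster ID(s); drop or impute them "
            "before calling the selector."
        )
    return cluster

# The silent failure: a NaN ID matches no group, not even its own.
ids = np.array([1.0, np.nan, 1.0])
rows_matched_by_nan = int((ids == np.nan).sum())
```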
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
CI AI review P1: my port of lprobust.vce multiplied the accumulated cluster meat by w = ((n-1)/(n-k)) * (g/(g-1)). But the R source (npfunctions.R:165-185) computes w but does NOT apply it to the returned M; w is dead code in the R function. My w factor would silently inflate clustered stage variances and drift the DPI bandwidths away from nprobust.
The public wrapper is unaffected (it hard-codes cluster=None), but the documented advanced-use entry point lpbwselect_mse_dpi supports cluster= and would have produced non-faithful results.
Fix: drop the w multiplier so lprobust_vce returns the raw sum-of-outer-products of cluster scores, matching R. A comment documents that w is computed-but-unused in R so future maintainers don't reintroduce the multiplier thinking it was a port omission.
Three new tests in TestLprobustVceClustered:
- test_clustered_meat_matches_unscaled_sum: bit-exact vs manual sum without the w factor
- test_clustered_is_symmetric: meat is symmetric
- test_clustered_end_to_end_through_lprobust_bw: smoke-tests the clustered DPI path end-to-end
P3 (stale comment): test_boundary_zero_with_data_far_from_zero_rejected docstring and comment referenced "1% of median" but the implemented rule is 5%. Aligned with the code.
P3 (same as last 4 rounds): tri/uni kernel golden parity extension.
190 tests pass (up from 187).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
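The "raw sum-of-outer-products of cluster scores" can be written out as a short sketch. clustered_meat is a hypothetical function, and treating each score row as a design row times its residual is an assumption about what the port accumulates.

```python
import numpy as np

def clustered_meat(scores, cluster):
    """Unscaled clustered meat: sum over clusters of s_c s_c', with s_c the cluster score sum."""
    scores = np.asarray(scores, dtype=float)
    cluster = np.asarray(cluster).ravel()
    k = scores.shape[1]
    meat = np.zeros((k, k))
    for c in np.unique(cluster):
        s_c = scores[cluster == c].sum(axis=0)
        meat += np.outer(s_c, s_c)
    # Deliberately no w = ((n-1)/(n-k)) * (g/(g-1)) multiplier: per the commit
    # message, the R source computes w but never applies it.
    return meat

scores = np.array([[1.0, 0.0], [2.0, 1.0], [0.0, 3.0]])
meat = clustered_meat(scores, cluster=[0, 0, 1])
```

For this toy input n=3, k=2, g=2, so the dropped factor would have been w = 4, a fourfold inflation of the stage variance.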
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment ✅ Looks good — no unmitigated P0/P1 findings. Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Previous phrasing "parity verified at 0.0000%" reads as "0% agreement" at a glance. What we actually have is 0.0000% relative error -- i.e., bit-parity within float64 precision. Clearer wording. No behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
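For concreteness, the quantity behind the "0.0000% relative error" wording can be computed as in this sketch; both function names are hypothetical and the values are illustrative, not taken from the golden file.

```python
def relative_error_pct(port_value, reference_value):
    """Relative error of the port against the R reference, in percent."""
    return 100.0 * abs(port_value - reference_value) / abs(reference_value)

def format_parity(port_value, reference_value):
    """Render the parity figure the way the summary text reports it."""
    return f"{relative_error_pct(port_value, reference_value):.4f}% relative error"

report = format_parity(0.3125, 0.3125)
```

Four printed decimals cannot distinguish a bit-exact match from a relative error below about 5e-5 percent, so a claim of float64-level parity rests on the test tolerances rather than on this formatted figure.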
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment ✅ Looks good — no unmitigated P0/P1 findings. Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
CI's "Copy tests to isolated location" step runs tests against the
installed package from /private/tmp/tests/, but does NOT copy
benchmarks/ alongside. My parity tests hard-failed with
FileNotFoundError when the golden JSON was not present at the
expected relative path.
Follow the established repo convention for golden-value fixtures:
    if not GOLDEN_PATH.exists():
        pytest.skip("Golden values file not found; run: Rscript ...")
Matches the pattern used by test_csdid_ported.py,
test_chaisemartin_dhaultfoeuille_parity.py, test_survey_real_data.py,
test_survey_estimator_validation.py, and test_linalg_hc2_bm.py.
Parity tests still run as hard gates:
- when invoked from the repo root (local /pre-merge-check,
/ai-review-local, dev iteration)
- when benchmarks/ is present alongside tests/ in CI jobs
They skip in the isolated-install job where only tests/ is copied.
Net effect matches the shipping convention for all other
R-backed parity suites in this repo.
190 tests pass (unchanged).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
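The skip-if-missing convention can be wrapped in a small loader, sketched below. load_golden_or_skip is a hypothetical helper, not something the repo is known to contain; only the pytest.skip call on a missing file mirrors the convention quoted above.

```python
import json
from pathlib import Path

def load_golden_or_skip(path, hint="regenerate it with the R script"):
    """Return parsed golden values, or skip the calling test when the file is absent."""
    path = Path(path)
    if not path.exists():
        # Imported lazily so the helper has no hard pytest dependency at import time.
        import pytest
        pytest.skip(f"Golden values file not found at {path}; {hint}")
    with path.open() as fh:
        return json.load(fh)
```

Because pytest.skip raises inside the test that calls the loader, parity tests stay hard gates wherever the JSON exists and are reported as skipped, not failed, in the isolated-install job.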
…-failures audit
Packages 161 commits across 18 PRs since v3.1.3 as minor release 3.2.0. Per project SemVer convention, minor bumps are reserved for new estimators or new module-level public API; BusinessReport / DiagnosticReport / DiagnosticReportResults (PR #318) add a new public API surface and drive this bump.
Headline work:
- PR #318 BusinessReport + DiagnosticReport (experimental preview) - practitioner-ready output layer. Plain-English narrative summaries across all 16 result types, with AI-legible to_dict() schemas. See docs/methodology/REPORTING.md.
- PR #327, #335 did-no-untreated foundation - kernel infrastructure, local linear regression, HC2/Bell-McCaffrey variance, nprobust port. Foundation for the upcoming HeterogeneousAdoptionDiD estimator.
- PR #323, #329, #332 dCDH survey completion - cell-period IF allocator (Class A contract), heterogeneity + within-group-varying PSU under Binder TSL, and PSU-level Hall-Mammen wild bootstrap at cell granularity.
- PR #333 performance review - docs/performance-scenarios.md documents 5-7 realistic practitioner workflows; benchmark harness extended.
Silent-failures audit closeouts (PRs #324, #326, #328, #331, #334, #337, #339) continue the reliability work started in v3.1.2-3.1.3 across axes A/C/E/G/J.
CI infrastructure: PRs #330 and #336 exclude wall-clock timing tests from default CI after false-positive flakes; perf-review harness is the principled replacement.
Version strings bumped in diff_diff/__init__.py, pyproject.toml, rust/Cargo.toml, diff_diff/guides/llms-full.txt, and CITATION.cff (version: 3.2.0, date-released: 2026-04-19). CHANGELOG populated with Added / Changed / Fixed sections and the comparison-link footer. CITATION.cff retains the v3.1.3 versioned DOI in identifiers; the v3.2.0 versioned DOI will be minted by Zenodo on GitHub Release and added in a follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- diff_diff.mse_optimal_bandwidth + BandwidthResult dataclass: the Phase 1b MSE-optimal bandwidth selector for the HAD (Heterogeneous Adoption Design) nonparametric estimator.
- Ports nprobust 0.5.0 (SHA 36e4e53) lpbwselect(bwselect="mse-dpi") into diff_diff/_nprobust_port.py (~500 lines): kernel_W, qrXXinv, lprobust_res, lprobust_vce, lprobust_bw, lpbwselect_mse_dpi. Three-stage DPI with four nested lprobust.bw calls at orders q+1, q+2, q, p.
- benchmarks/R/generate_nprobust_golden.R produces benchmarks/data/nprobust_mse_dpi_golden.json with every stage bandwidth + per-stage (V, B1, B2, R). Python port matches R at 0.0000% relative error on all five stage bandwidths across three deterministic DGPs (uniform, Beta(2,2), half-normal).
- Public wrapper is restricted to the HAD Phase 1b configuration (vce="nn", nnmatch=3, p=1, deriv=0, interior=False); the broader port surface is available via _nprobust_port for Phase 1c.
- Phase 1b checkbox ticked in docs/methodology/REGISTRY.md with explicit notes about weights= being unsupported (no nprobust parity anchor) and the public API scope restriction.
Methodology references (required if estimator / math changes)
- nprobust 0.5.0 CRAN package (GitHub SHA 36e4e532d2f7d23d4dc6e162575cca79e0927cda, github.com/nppackages/nprobust); the R source at npfunctions.R:187-288 (lprobust.bw) and :498-607 (lpbwselect.mse.dpi) is cited inline in the Python port.
- docs/methodology/REGISTRY.md#heterogeneousadoptiondid.
- weights= raises NotImplementedError in the public wrapper. nprobust::lpbwselect has no weight argument, so there is no parity anchor. Deferred to Phase 2 (survey-design adaptation). Documented in REGISTRY and docstring.
- Public API scope restriction (vce="nn", nnmatch=3, p=1, deriv=0, interior=False are hard-coded in mse_optimal_bandwidth). The underlying port supports a broader surface but those paths are not parity-tested. Callers needing the broader surface go through diff_diff._nprobust_port.lpbwselect_mse_dpi. Documented in REGISTRY and docstring; pinned by a behavioral test.
Validation
- tests/test_nprobust_port.py (NEW, 14 tests): unit tests for every ported helper (kernel, Cholesky-inverse, NN residuals, clustered/unclustered meat, single-stage lprobust_bw parity).
- tests/test_bandwidth_selector.py (NEW, 78 tests via parametrize): 5-bandwidth parity on 3 DGPs at 1% + per-stage (V, B1, B2) parity + validation (shapes, finiteness, unknown kernel, weights, bwcheck range) + API-scope pin test + kernel-dispatch + boundary + downstream-integration + rate-scaling (G^{-1/5} MC).
- tests/test_local_linear.py: unchanged; still green as regression anchor for Phase 1a kernels.
Security / privacy
🤖 Generated with Claude Code